Role of the sampling weight in evaluating classical time autocorrelation functions 
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We analyze how the choice of the samphng weight affects the efficiency of the Monte Carlo evaluation of 
classical time autocorrelation functions. Assuming uncorrelated sampling or sampling with constant correlation 
length, we propose a sampling weight for which the number of trajectories needed for convergence is inde- 
pendent of the correlated quantity, dimensionality, dynamics, and phase-space density. In contrast, it is shown 
that the computational cost of the "standard" intuitive algorithm which samples directly from the phase-space 
density may scale exponentially with the number of degrees of freedom. Yet, for the stationary Gaussian distri- 
bution of harmonic systems and for the autocorrelation function of a linear function of phase-space coordinates, 
the computational cost of this standard algorithm is also independent of dimensionality. 



Introduction: Time-correlation functions. Many dynami- 
cal properties of stationary systems as well as the response 
of such systems to weak perturbations can be infened from 
time autoconelation functions Examples include the 

optical absorption line shapes computed from the dipole time 
autocorrelation function, the diffusion coefficient computed 
from the velocity time autocorrelation function, and various 
relaxation properties [3]. More general time correlation func- 
tions are in fact the principal ingredients of semiclassical fllH 
and path-integral [6-11] calculations of quantum dynamical 
properties. Trajectory-based methods for computing time cor- 
relation functions, however, often become too expensive in 
many-dimensional systems. Yet, dimensionality-independent 
algorithms have been found for special correlation functions, 
such as classical [T^] and semiclassical 1 1 3] fidelity [ 14] . Mo- 
tivated by the success in these special cases and by the im- 
portance of correlation functions in many areas of physics, in 
this Letter we explore how these functions can be computed 
more efficiently in general. In particular, we propose a sam- 
pling weight for which the number of trajectories needed for 
convergence of any classical normalized time autocorrelation 
function is independent of dimensionality. 

Quantum mechanically, the unnormalized time autocorrela- 
tion function (t) of a vector operator A may be written 
as 



(t) = Tr(p"A".A*), 



(1) 



where p° is the density operator of the state. A" is the operator 
evaluated at time t — 0, A'^ — g^nt/fi^^-iHt/h ^j^g oper- 
ator A evolved with Hamiltonian H for time t, and subscript 
"u" emphasizes that the coiTelation function is not normal- 
ized. The classical analog C^^ {t) of the quantum correlation 
function ^ is 



(t) = h-° / dxp° (x) A° (x) • A* (x) 



(2) 



where x :— {q,p) is the 2Z?-dimensional phase-space co- 
ordinate, (x) is the initial phase-space density. A" (x) is 
the classical observable A evaluated at time t = 0, and 
A* (x) — e^^*A*' (x) is this function A evolved classically 
for time t with the Liouville operator L = {H,-}. Note 



that besides a 3-dimensional vector (such as the molecular 
dipole /x), A can also be a scalar (A) or a higher-dimensional 
phase-space vector To make the connection between classi- 
cal and quantum mechanical expressions explicit, the phase- 
space volume is measured in units of h^. Since our focus 
is on classical coiTelation functions, superscript CL will be 
omitted from this point forward. 

The shape of the autocorrelation function is often more in- 
teresting than its overall magnitude f]3\, and hence one often 
computes the time autocorrelation C{t) which is normalized 
with respect to its initial value: 



C{t) 



Cu jt) 



(3) 



Algorithms. Most common methods for evaluating Eqs. (|2|l 
and (|3) in many-dimensional cases are based on classical tra- 
jectories. Two general approaches are cunently used 1) 
the direct approach in which initial conditions for many trajec- 
tories are sampled from the stationary distribution p and the 
trajectories are subsequently evolved simultaneously in time; 
and 2) the single-trajectory approach in which only one trajec- 
tory is evolved in time and the desired autocon^elation function 
is computed as an average of many congelation functions com- 
puted using the same trajectory but initiated at different times. 
The direct approach is more general and does not require the 
ergodicity of the time evolution, whereas the single trajectory 
approach is generally simpler as it avoids explicit sampling of 
p. Here we explore modifications of the direct approach using 
generalized sampling weights. 

We start by expressing the conelation function (|2) in terms 
of trajectories. 



Cu(t) 



dx'>p {x°) A {x'^) ■ A {x- 



(4) 



where x* := $*(a;°) is the phase-space coordinate at time t of 
a trajectory of the Hamiltonian flow $* with initial condition 
x". We further rewrite Eq. (0) in a form suitable for Monte 
Carlo evaluation, i.e., as an average 



{E{x',t)) 



w 



Jdx°E{x^,t)W{x'^) 
JdxOW{xO) 



(5) 
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where the positive definite function W is the samphng weight 
and E is the estimator. In the Monte Carlo method, average 
(|5]l is evaluated numerically as an average 



1 ^ 



(6) 



over N trajectories whose initial conditions are sampled 
from the weight W. 

The convergence rate of the sum (|6) usually depends on 
W. Among many possible weights W, the form of Eq. (|4|l 
immediately suggests the following three: W{x) = p{x), 
W{x) = p{x) |A(a;)|, and W{x) — p{x)A{x)^. These three 
weights lead to three different algorithms, which may be gen- 
erally written as 



Cu,w {t)^Iw {Ew 



(7) 



where Iw := h ^ J W{x)dx denotes the norm of W and the 
estimators are 



Ep{x°,t)=A{x°)-A{x-') 



A(x") - A (a;-*) 
|A(xO)| ^ 

A{x°)-A{x-*) 
|A(xO)|2 ■ 



(8) 
(9) 

(10) 



Substitution of Eq. Q into the definition (O yields a Monte 
Carlo prescription for the normalized correlation function: 



Cw (t) = 



{Ew{x°,t))^ 
(i?w(x°,0))^ 



(11) 



Note that since Ep^^ O) = 1, no normalization is needed 
for the pA^ algorithm. The two averages in Eq. (fTTT l may be 
evaluated either with two independent Monte Carlo simula- 
tions or during a single Monte Carlo simulation. Here we con- 
sider only the latter possibility, as it is computationally faster 
and normaUzes both Cp (0) and Cp| a| (0) exactly. 

Statistical errors. The three algorithms differ by the sam- 
pling weight W used and consequently also by the estima- 
tor Ew- The computational cost of all three algorithms is 
O (c-^iV), where N is the number of trajectories. At the 
time step used, and c the combined cost of a single evalua- 
tion of the force (needed for the dynamics) and of the esti- 
mator Ew- Usually, the cost of evaluating the estimator is or 
can be made negligible to that of evaluating the force. There- 
fore the costs of the algorithms differ mainly in the number 
N of trajectories needed to achieve a desired precision (i.e., 
discretization error) fJdisci-- 

Alternatively, the algorithms can be compared by evaluat- 
ing the discretization errors (Xdiscr.iv resulting from a given 
number TV of trajectories. For an unbiased estimator, the 
discretization error tJdiscr is equal to the statistical error aw, 
where aw{N, tf = CwiN,t)^ - Cw{N,tf and the over- 
line denotes an average over an infinite number of simulations 



with different sets of N trajectories. Assuming for now that 
the N trajectories are uncorrected, one can show that the error 
of the unnormalized Cy [t) satisfies 



a,,w{N.tf^^-f 



'Ew{x\tf)^^{Ew{x\t))l^ 

(12) 

For W = pA^, the error of normalized C{t) satisfies an anal- 
ogous relation obtained by removing factors of Iw from Eq. 
( fT2] i. Statistical errors of algorithms with weights p and p | A|, 
which must be normalized according to Eq. (fTTT l. are found 
from the formula for the statistical error of a ratio of random 
variables: 



\sJt) 

In our case, S 
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Cy.,w {N,t) and T = Cu.iy (iV,0). Realiz- 
Cu (i) we obtain the following general 



ing that Cy,^w{N,t) 
expression for the statistical errors of the three algorithms 



(Tw {N,tY 



1 



Nd 



w 



awC{tf - 2bwC{t)+cw 



(14) 



where Op = {\A^*\* p/W)p, bp = (| A^j^ (AO • A*) p/W")p, 

Cp ^ {{A° ■A')'^p/W)p,dp = (|A0|%/iy)2, and an abbre- 
viated notation A* := A{x^*') was used. The special cases 
are obtained by replacing W with p, p |A|, or pA'^ in these 
expressions. 

For W = pA^, the coefficients can be rearranged as 
apA- = -dpA-, hpA- = 0, CpA2 = ((A0.A*)V|A0|2)^, 
and dpA2 = (jA^p)^. Using the Cauchy-Schwarz inequality 

(A° • A*)^ < I A° PI A* p in the expression for CpA^ and the 
fact that for stationary distributions (|A°p)^ = (|A*p)^, 

we find that CpA^ < (A {x^*)'^)p = dpA^ and realize that for 
the weight pA^ the upper bound for the statistical error de- 
pends only on and the value of the autocorrelation function 

at): 



y'pA^ {N,t)<-{\ 



C{tf 



(15) 



In particular, the error does not explicitly depend on the di- 
mensionality D of the system, chaoticity of its dynamics, the 
nature of the observable A, or time t. This remarkable fact is 
the main thesis of this paper. 

Special cases. One cannot make a similar general state- 
ment about either of the algorithms using weight p or p |A| . 
We therefore turn to two special cases permitting analyti- 
cal evaluation of the statistical errors. Both examples in- 
volve a many-dimensional harmonic oscillator (HO) H = 
(1/2)(p^/to + kq^) and its stationary Gaussian distribution 

p{x) ^ [2tanh(w/2)]-°exp[-tanh(M/2)((7Va^+P^a^/^^)]j 

(16) 

given by the Wigner transform of the Boltzmann density op- 
erator Above, u := fShuj, uP' — k/m, a? = h/{mLu). [Note 
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that the ground state density and the classical Boltzmann dis- 
tribution can be obtained as the limits of Eq. ( fTSI l for /? — > cxo 
and /3 0, respectively.] The two examples differ in the 
choice of the observable A. 

Exponential growth of a with D. First consider A to be the 
product of coordinates: A — qiq2 ■ ■ ■ qo- The statistical error 
for W = pA^ is described by Eq. ( fTSl l in full generality and 
thus is independent of D. On the other hand, straightforward 
but somewhat tedious calculations using Eq. (fT4l) show that 
statistical errors for both weights p and p\A\ grow exponen- 
tially with the number of dimensions D: 



1 

N 



(17) 



1+ yatf 



D 



(18) 



The fact that for W = p and p\A\ there exist observables 
for which the error grows exponentially with D is our second 
main result. Similar behavior of a is expected for any mul- 
tiplicatively separable function A of phase-space coordinates, 
such as the Gaussian A = cxp(— g^/a^). 

Independence of D. Yet, the situation is not always so 
bleak. Consider the correlated function A = p,' ■ q to he a 
linear function of coordinates q (p' is a /^-dimensional vec- 
tor). In this important special case, all three sampling methods 
have statistical errors independent of dimensionality: 



p or pA' 



{N,tf 



(19) 
(20) 



The proof of Eq. ( |20] l for weight p\A\ is somewhat involved 
and was done only for the case pi = ■ ■ ■ — po. On the 
other hand, Eq. ( fT9] l remains valid even for HOs with different 
frequencies in different dimensions. Note that the statistical 
error is slightly lower for W = p\A\ than for W — p or pA^. 

Sampling methods and correlation length. Before pre- 
senting numerical examples, let us briefly discuss the sam- 
pling methods. In many dimensions, sampling from a gen- 
eral weight W is often performed with the Metropolis method 



1 171 - 11911 . Two variants are used here: The "original" Metropo- 
lis method proposes the new point Xnew using a random walk 
step from the last accepted point Xo\d\ a;new is accepted with 



probability pa 



min[l^(a;„ow)/W^(a;oid),l]- If a^n. 



rejected, the last accepted point Xo\a is duplicated. In the 
"product" Metropolis method, W is factorized as = Y Z, 
where Y can be sampled "directly" to propose a new point 
a^new which is subsequently accepted with probability pacc — 
min[Z(x„ow)/-^(a;oid), !]■ 

Unfortunately, except for a few distributions W (such as 
the uniform or normal distributions, which may be sampled 



"directly"), points generated by Metropolis methods are cor- 
related, leading to a correlation length A^coir > 1 between 
samples. This increases the statistical error for a given num- 
ber of samples N . As a consequence, in all of our analytical 
expressions, N should be replaced by N/N^on-, which can af- 
fect (slightly) the dependence of a on D. An important factor 
increasing iVcorr is the rejection of proposed moves, which re- 
sults in exactly identical samples. In a properly designed code, 
however, these repeated samples do not increase the compu- 
tational cost; they are accounted for by increasing the statisti- 
cal weight of the original (not yet duplicated) sample. Thus, 
strictly speaking, the efficiency of a sampling algorithm de- 
pends on the number A'^iniq of unique trajectories needed for 
convergence rather than on the total number N of trajectories. 
While we took Nco„ into account in the numerical calcula- 
tions, a detailed analysis of iVcon, which can both increase 
(slowly) or decrease (slowly) with D, is beyond the scope of 
this paper 

Numerical results. We first confirmed our analytical results 
for HOs numerically using k — m — h — /3 = 1. Numerical 
statistical errors were estimated by averaging these errors over 
100 independent simulations, each with the same number of 
unique trajectories iVuniq = 5 x 10^. In order to compare with 
the analytical results, the effect of correlation was removed by 
converting the numerical statistical error cr to an error per tra- 
jectory (Ji {N/Nco„y^^a. The correlation lengths iVcorr 
were estimated using the method of block averages ||20 tl. 

Figure[T]shows that for A — qiq2 ■ ■ ■ qo, the error cti grows 
exponentially with D for both weights p and p\A\ while it is 
independent of D for W ~ pA? . Moreover, numerical results 
agree with the analytical predictions (flSl l. (fTTt . and (fTsT l. The 
original Metropolis method was used since the acceptance rate 
of the product Metropolis method was prohibitively low for 
high D. The step size of the random walk was the same for 
all three weights but varied weakly with D for the sake of 
a reasonable acceptance rate. [Note that in our calculations 
fpA2 — (-^con-/^)^^^o'i,pA2 itself grew slightly with D due 
to a slow growth of the correlation length iVcorr with D. For 
W = p, iVcorr decreased slightly with D and for = p jAj it 
stayed approximately constant, but these effects did not cancel 
the overall exponential growth of the error Even though A'corr 
can be varied to some extent by modifying the step size of the 
random walk, this was not explored in detail here.] 

Figure |2] compares the analytical predictions with numer- 
ically computed errors for A ~ p' ■ q, where p' is a D- 
dimensional vector with all entries equal to 1. Such A can 
be interpreted as a linear approximation to the electric dipole 
of a nonpolar molecule. Figure |2] confirms that the statistical 
error ai is independent of D for all three algorithms. Initial 
conditions were sampled using the product Metropolis algo- 
rithm with W = YZ and Y = pin all cases. Function Z used 
in the acceptance criterion was equal to 1, \A\, and A"^, for 
W = p, p\A\, and pA^, respectively. Therefore, for W ~ p, 
Ncoi-r = 1 and N = iVuniq, while for = p |A| and pA^, 

TVcorr > 1 and iV > iVu„iq. 

Finally, we used the three algorithms to calculate the 
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statistical errors for A = y^q^-.-qf^ and C = 0.5 High energy part of the vibrational spectrum of azulene 




Figure 1 : Expected statistical error per trajectory of the autocorre- 
lation function C {t) of the function A = qiq2 ■ ■ ■ qn in a many- 
dimensional harmonic oscillator. The statistical error is indepen- 
dent of dimensionality for the algorithm with weight W = pA^ and 
grows exponentially with D for the other two weights. Time t was 
chosen separately for each D so that C{t) ~ 0.5. 
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Figure 2: Expected statistical error per trajectory of the autocorre- 
lation function C'{t) of the linear operator A = /i' ■ q in a. many- 
dimensional harmonic oscillator. The statistical error is independent 
of dimensionality for all three sampling weights studied. Time t was 
chosen separately for each D so that C{t) ~ 0.5. 



vibrational spectrum of a 48-dimensional harmonic model 
of the ground electronic state of azulene computed at the 
CASSCF(4,6)/6-31G* level of theory. Obsei-vable A was a 
linear approximation of the dipole moment of azulene, A = 
fj, = fiQ + Ijl' ■ q, where /Xq /^(O) is the equilibrium dipole 
moment (a 3-dimensional vector) and /i' the 3 x D matrix of 
derivatives of the dipole moment at g = 0. Sampling was 
performed the same way as in the previous example. The 
dipole autocorrelation function C (t) was computed intention- 
ally only up to time ttot = 1.45 ps, which is the minimum 
time needed to resolve all vibrational peaks, and with only 
-^imiq = unique trajectories, for which C{t) starts to con- 



Figure 3: The high frequency part of the vibrational spectrum of a 
harmonic model of azulene computed via the Fourier transform of 
the dipole time autocorrelation function. 



verge. Prior to computing the spectrum via a Fourier trans- 
form, C{t) was damped by a multiplication with the function 
cos(7rt/2ttot)^- After the transform, [C (t)] (uj) was multi- 
pHed by the factor 2cj tanh f'^l^^ , which includes the stan- 
dard "quantum correction" 121 for the lack of detailed balance 
in the classical C (<). While this coiTection is not exact even 
for HOs if p is the classical Boltzmann density, it becomes 
exact for harmonic systems if p is the Wigner Boltzmann den- 
sity ( fTSI l. Figure 3, showing the high-frequency region of the 
spectrum containing the C-H bond stretches, confirms that all 
three algorithms converge to the same result (agreeing, within 
the resolution, with the exact spectrum). Moreover, even in 
this slightly more general case than the one considered in 
Fig. 2, the statistical errors associated with all three sam- 
pling weights stayed approximately independent of D. (Sys- 
tems with D < 48 were generated by progressively cutting 
off the lowest frequency normal modes of azulene.) 

Conclusions. We have demonstrated the existence of a sam- 
pling weight for which the number of trajectories needed for 
convergence of the normalized time autocorrelation function 
of any phase-space function A is independent of the dimen- 
sionality and the underlying dynamics of the system. This 
sampling weight is W ~ pA?, which may not be surprising 
at time < = 0, when this W represents the ideal importance 
sampling weight with all trajectories contributing unity to the 
sum (|6]l. Here we have shown that this sampling weight re- 
tains its favorable properties also for t > by proving that 
(TpA2 depends explicitly only on C {t) itself, and not on other 
parameters of the system. 

While best suited for normalized autocorrelation functions, 
weight pA^ can also accelerate calculations of unnormal- 
ized autocoiTelation functions Cu {t) via the relation Cu(<) = 
Cu(0)C(i). In the latter case, weight pA^ is retained for the 
dynamical calculation of C{t), which is usually the most time- 
consuming task by far. The initial norm Cu (0) must be com- 
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puted separately using a normalized sampling weight such as 
p. Yet, one can afford many more trajectories for computing 
Cu (0) since this calculation does not require any dynamics. 

To conclude, we hope that the dimensionality-independent 
sampling weight will find its use in other classical, semiclassi- 
cal ||4|,|5|], and even quantum mechanical traiectory-based ap- 
plications, such as those using the centroid I^I^HI or ring- 
polymer IIT Hllll molecular dynamics. 
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